Exploiting Information Needs and Bibliographics for Polyrepresentative Document Clustering
Abstract
In this paper we explore the potential of combining the principle of polyrepresentation with document clustering. We discuss and evaluate the idea for polyrepresentation of information needs as well as for document-based polyrepresentation, where bibliographic information serves as a representation. The main idea is to present the user with highly ranked polyrepresentative clusters to support the search process. Our evaluation suggests that the approach is capable of increasing retrieval performance, but that performance varies between queries with a high and a low number of relevant documents.
Similar resources
Polyrepresentative Clustering: A Study of Simulated User Strategies and Representations
The principle of polyrepresentation and document clustering are two established methods for Interactive Information Retrieval, which have been used separately so far. In this paper we discuss a cluster-based polyrepresentation approach for information-need-based and document-based representations. In our work we simulate and evaluate two possible cluster browsing strategies a user could apply to expl...
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays a key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researchers to use...
Investigation through and Clustering the Information Needs and Information Seeking Behavior of Seminary and University Students of Khorasan-e- Razavi with Neural Network Analysis
Background and Aim: This study aims to investigate and cluster the information needs and information-seeking behavior of seminary and university students in Khorasan-e- Razavi using neural network analysis. Methods: This quantitative study is an applied, descriptive survey conducted with neural network analysis. Data were collected by a questionnaire based on the information needs and inf...
Hierarchical Fuzzy Clustering Semantics (HFCS) in Web Document for Discovering Latent Semantics
This paper discusses the future of World Wide Web development, called the Semantic Web. Undoubtedly, the Web service is one of the most important services on the Internet and has had the greatest impact on the spread of the Internet in human societies. Internet penetration has been an effective factor in the growth of the volume of information on the Web. The massive growth of informat...
Metaheuristic Clustering of Persian XML Documents Based on Structural and Content Similarity
Due to the increasing number of XML documents, organizing them effectively in order to retrieve useful information from them is essential. A possible solution is to cluster XML documents in order to discover knowledge. A key issue in clustering XML documents is how to measure the similarity between them. Conventional clustering of text documents using a do...